DDPSC Data Science Core, July 2023
pcvr Goalspcvr::conjugatepcvr::conjugate argumentsconjugate outputpcvr GoalsCurrently pcvr aims to:
There is room for goals to evolve based on feedback and scientific needs.
Pre-work was to install R, Rstudio, and pcvr.
Talk about probability, compare to frequentist, introduce theorem.
| Frequentist | Bayesian | |
|---|---|---|
| Fixed | True Effect | Observed Data |
| Random | Observed Data | True Effect |
| Interpretation | P[Data | No Effect] | P[Hypothesis | Observed Data] |
\[\begin{equation} P(\theta|(x_1, \ldots, x_i)) = \frac{\pi(\theta) \cdot L(\theta|(x_1, \ldots, x_i))}{\int \pi(\theta) \cdot L(\theta|(x_1, \ldots, x_i))~d\theta} \end{equation}\]
\[\begin{equation} P(\theta|(x_1, \ldots, x_i)) = \frac{\pi(\theta) \cdot L(\theta|(x_1, \ldots, x_i))}{\int \pi(\theta) \cdot L(\theta|(x_1, \ldots, x_i))~d\theta} \end{equation}\]
\(P(\theta|(x_1, \ldots, x_i))\) = Posterior Distribution (Conclusion as a PDf)
\(\pi(\theta)\) = Prior Distribution (Knowledge as a PDF)
\(L(\theta|(x_1, \ldots, x_i))\) = Likelihood (Data that we collected)
\(\int \pi(\theta) \cdot L(\theta|(x_1, \ldots, x_i))~d\theta\) = Marginal Distribution (this is the problem area)
Solving this integral is potentially a very difficult problem.
Historically there have been two answers:
Solving this integral is potentially a very difficult problem.
Historically there have been two answers:
This may all still seem abstract, so we’ll try to clear it up with two examples.
In the previous example we updated a fundamental verb with context.
Here we’ll update a probability distribution with data.
The P parameter of a Binomial distribution has a Beta conjugate prior.
\[\begin{equation} x_1, \ldots, x_n \sim Binomial(N, P) \\ P \sim Beta(\alpha, \beta) \\ Beta(\alpha', \beta' |(x_1, \ldots, x_n)) = Beta(\alpha, \beta) \cdot L(\alpha, \beta|(x_1, \ldots, x_n)) \\ \alpha` = \alpha + \Sigma(\text{Successes} \in x) \\ \beta` = \beta + \Sigma(\text{Failures} \in x) \end{equation}\]
Generally we can think of conjugacy as when we know the distribution of a parameter in another distribution.
pcvr::conjugateIn pcvr 11 distributions are supported in the conjugate function.
We’ll go over those distributions, what they tend to represent, how they are updated, and what the common alternative tests would be for that kind of data.
pcvr::conjugate| Distribution | Data | Updating | Common Option |
|---|---|---|---|
| Gaussian | Normal | \(\mu', \sigma' \sim N(\mu, \sigma)\) | Z-test |
| T | Normal Means | \(\mu', \sigma_\mu, \nu_\mu' \sim T(\mu, \sigma, \nu)\) | T-test |
| Lognormal | Positive Right Skewed | \(\mu' \sim N(\mu, \sigma)\) | Wilcox |
| Lognormal2 | Positive Right Skewed | \(\rho' \sim \Gamma(A, B)\) | Wilcox |
| Beta | Percentages | \(\alpha', \beta' \sim \alpha, \beta + Counts\) | Wilcox |
| Binomial | Success/Failure Counts | \(P \sim Beta(\alpha, \beta)\) | Wilcox/logistic regression |
| Poisson | Counts | \(\lambda \sim Gamma(A, B)\) | Wilcox/glm |
| Neg-Binom. | Overdispersed Counts | \(P \sim Beta(\alpha, \beta)|r\)) | Wilcox/glm |
| Von Mises | Circular | \(\mu', \kappa'^* \sim VMf(\mu, \kappa)\) | Watsons |
| Uniform | Positive Flat | \(Upper \sim Pareto(A, B)\) | Wilcox |
| Pareto | Heavy Tail | \(Shape \sim \Gamma(A, B)| Loc.\) | Wilcox |
| Gamma | Right Skew | \(Rate \sim \Gamma(A, B)| Shape\) | Wilcox |
| Bernoulli | Logical | \(Rate \sim Beta(\alpha, \beta)\) | Logistic Regression |
| Exponential | Right Skew | \(Rate \sim \Gamma(A, B)\) | Wilcox/glm |
pcvr::conjugatepcvr::conjugate argumentsconjugate takes one or two sets of SV (numeric) or MV (matrix/df) data. Alternatively this can be a formula and a dataframe, similar to stats::t.test.
pcvr::conjugate(
s1 = NULL, s2 = NULL,
method = c(
"t", "gaussian", "beta",
"binomial", "lognormal", "lognormal2",
"poisson", "negbin",
"uniform", "pareto",
"vonmises", "vonmises2"
),
priors = NULL,
plot = FALSE,
rope_range = NULL, rope_ci = 0.89,
cred.int.level = 0.89,
hypothesis = "equal",
support = NULL
)pcvr::conjugate argumentsThe method argument specifies the distribution to be used. See ?conjugate for further details.
pcvr::conjugate(
s1 = NULL, s2 = NULL,
method = c(
"t", "gaussian", "beta",
"binomial", "lognormal", "lognormal2",
"poisson", "negbin",
"uniform", "pareto",
"vonmises", "vonmises2"
),
priors = NULL,
plot = FALSE,
rope_range = NULL, rope_ci = 0.89,
cred.int.level = 0.89,
hypothesis = "equal",
support = NULL
)pcvr::conjugate argumentsThe priors argument allows you to specify the prior distribution. If left NULL then default priors will be used
pcvr::conjugate(
s1 = NULL, s2 = NULL,
method = c(
"t", "gaussian", "beta",
"binomial", "lognormal", "lognormal2",
"poisson", "negbin",
"uniform", "pareto",
"vonmises", "vonmises2"
),
priors = NULL,
plot = FALSE,
rope_range = NULL, rope_ci = 0.89,
cred.int.level = 0.89,
hypothesis = "equal",
support = NULL
)pcvr::conjugate default priorspcvr::conjugate argumentsThe plot argument controls whether or not a ggplot is made of the results. See later examples.
pcvr::conjugate(
s1 = NULL, s2 = NULL,
method = c(
"t", "gaussian", "beta",
"binomial", "lognormal", "lognormal2",
"poisson", "negbin",
"uniform", "pareto",
"vonmises", "vonmises2"
),
priors = NULL,
plot = FALSE,
rope_range = NULL, rope_ci = 0.89,
cred.int.level = 0.89,
hypothesis = "equal",
support = NULL
)pcvr::conjugate argumentsThe rope_range and rope_ci arguments allow region-of-practical-equivalence (ROPE) testing using the difference in the posterior distributions if two samples are given.
pcvr::conjugate(
s1 = NULL, s2 = NULL,
method = c(
"t", "gaussian", "beta",
"binomial", "lognormal", "lognormal2",
"poisson", "negbin",
"uniform", "pareto",
"vonmises", "vonmises2"
),
priors = NULL,
plot = FALSE,
rope_range = NULL, rope_ci = 0.89,
cred.int.level = 0.89,
hypothesis = "equal",
support = NULL
)pcvr::conjugate argumentscred.int.level controls the credible intervals that are calculated on the posterior distributions. The default of 89% is arbitrary.
pcvr::conjugate(
s1 = NULL, s2 = NULL,
method = c(
"t", "gaussian", "beta",
"binomial", "lognormal", "lognormal2",
"poisson", "negbin",
"uniform", "pareto",
"vonmises", "vonmises2"
),
priors = NULL,
plot = FALSE,
rope_range = NULL, rope_ci = 0.89,
cred.int.level = 0.89,
hypothesis = "equal",
support = NULL
)pcvr::conjugate argumentsThe hypothesis argument sets which hypothesis is tested out of “greater”, “lesser”, “equal” and “unequal”. These are read as “s1 equal to s2”, etc.
pcvr::conjugate(
s1 = NULL, s2 = NULL,
method = c(
"t", "gaussian", "beta",
"binomial", "lognormal", "lognormal2",
"poisson", "negbin",
"uniform", "pareto",
"vonmises", "vonmises2"
),
priors = NULL,
plot = FALSE,
rope_range = NULL, rope_ci = 0.89,
cred.int.level = 0.89,
hypothesis = "equal",
support = NULL
)pcvr::conjugate argumentsThe support argument optionally lets you set the support range for the distribution. In practice this exists for internal use and is not something you would generally want to set.
pcvr::conjugate(
s1 = NULL, s2 = NULL,
method = c(
"t", "gaussian", "beta",
"binomial", "lognormal", "lognormal2",
"poisson", "negbin",
"uniform", "pareto",
"vonmises", "vonmises2"
),
priors = NULL,
plot = FALSE,
rope_range = NULL, rope_ci = 0.89,
cred.int.level = 0.89,
hypothesis = "equal",
support = NULL
)conjugate outputLastly we’ll show a few interpretations of conjugate output in the plant phenotyping context.
Germination Rates (or other binary outocmes like flowering or death) can make good sense as Beta-Binomial data.
Here we’d simply conclude that there is about a 19% chance that the germination rate is the same between these two genotypes after 1 week. We could do more with ROPE testing, but we’ll do that in the next example.
Lots of phenotypes are gaussian and conjugate can be used similarly to a T-test with the “t” method. Consider area data that looks like this example.
Here we include a ROPE test corresponding to our belief that any difference in Area of \(\pm2 cm^2\) is biologically insignificant. We also show the formula syntax and use non-default priors here (since default priors include negative values which can’t happen with area).
Our plot shows about a 83% chance that these distributions are unequal and a 24% chance that the difference in means is within \(\pm2 cm^2\).
The other aspects of the output are a summary and the posterior parameters.
$summary
[1] "data.frame"
$posterior
[1] "list"
$plot
[1] "patchwork" "gg" "ggplot"
The summary is a data.frame with a summary of the information in the plot.
The other aspects of the output are a summary and the posterior parameters.
$summary
[1] "data.frame"
$posterior
[1] "list"
$plot
[1] "patchwork" "gg" "ggplot"
The posterior is the prior list updated with the given data, this allows for Bayesian updating hypothetically.
There are also several phenotypes that are counts. Numbers of vertices, leaves, flowers, etc could all be used with one of the count distributions. Here we consider Poisson distributed leaf counts between two genotypes.
Here we model \(X \sim Poisson(\lambda)\\ \lambda \sim \Gamma(A, B)\)
We can comfortably say that the difference in the posteriors is not in [-1, 1] and there is a 91% chance that the Gamma distributions for \(\lambda\) are different.
Finally, we’ll show an example using what is likely the least-familiar distribution in conjugate, the Von-Mises distribution.
The Von-Mises distribution is a symmetric circular distribution defined on \([-\pi, \pi]\).
To use Von-Mises with data on other intervals there is a boundary element in the prior that is used to rescale data to radians for the updating before rescaling back to the parameter space. See ?conjugate for more examples of the boundary.
Note this is a very exaggerated example for the plant phenotyping setting since green happens to be in the middle of the hue circle, which wraps in the red colors.
If you do have wrapped circular data then looking at it in a non-circular space like this would be a problem. For values we normally get from plants other continuous methods can generally be a decent approximation.
Here our distributions are very similar and there is about a 79% chance that their parameters are the same given this data and our (very wide) priors.
Note that if you use data in a different boundary space than radians the rope_range would be given in the observed parameter space, not in radians.
Hopefully this has been a useful introduction to Bayesian conjugacy.
Please direct questions/issues with pcvr::conjugate to the pcvr github issues or help-datascience slack channel for DDPSC users.